Abstract.

A statistical procedure is asymptotically robust if its large-sample properties hold under conditions more general than the conditions under which the procedure is derived. The justification of such properties is often based directly or indirectly on a central limit theorem. In this paper a form of the Lindeberg condition appropriate for martingale differences is used to obtain consistency and asymptotic normality of statistics for regression and autoregression. The regression model is $y_t = B z_t + v_t$. The unobserved error sequence $\{v_t\}$ is a sequence of martingale differences with conditional covariance matrices $\{\Sigma_t\}$ and satisfying

$\sup_{t=1,2,\ldots} \mathcal{E}[v_t' v_t I(v_t' v_t > a) \mid z_t, v_{t-1}, z_{t-1}, \ldots] \xrightarrow{p} 0$

as $a \to \infty$. The sample covariance of the independent variables $z_1, \ldots, z_n$ is assumed to have a probability limit $M$, constant and nonsingular; $\max_{t=1,\ldots,n} z_t' z_t / n \xrightarrow{p} 0$. If $(1/n)\sum_{t=1}^n \Sigma_t \xrightarrow{p} \Sigma$, constant, then $\sqrt{n}\,\mathrm{vec}(\hat{B}_n - B) \xrightarrow{\mathcal{L}} N(0, M^{-1} \otimes \Sigma)$.

The autoregression model is $x_t = B x_{t-1} + v_t$ with the above conditions on $\{v_t\}$ and

$\frac{1}{n} \sum_{t=\max(r,s)+1}^{n} (\Sigma_t \otimes v_{t-1-r} v_{t-1-s}') \xrightarrow{p} \delta_{rs} (\Sigma \otimes \Sigma)$,

where $\delta_{rs}$ is the Kronecker delta. Then $\sqrt{n}\,\mathrm{vec}(\hat{B}_n - B) \xrightarrow{\mathcal{L}} N(0, \Gamma^{-1} \otimes \Sigma)$, where $\Gamma = \sum_{s=0}^{\infty} B^s \Sigma (B')^s$.
1.  Introduction.

A statistical procedure is asymptotically robust if its large-sample properties hold under conditions more general than the conditions under which the procedure is derived. The justification of such procedures is often based directly or indirectly on a central limit theorem. In this paper Lindeberg-type conditions are utilized to establish asymptotic normality of sample regression and autoregression coefficients.

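The classic central limit theorem for independent identically distributed scalar random variables $x_1, x_2, \ldots$ states that $\sqrt{n}\,\bar{x}_n \xrightarrow{\mathcal{L}} N(0, \sigma^2)$ as $n \to \infty$ if $\mathcal{E}x_t = 0$ and $\mathcal{E}x_t^2 = \sigma^2$; here $\bar{x}_n = \sum_{t=1}^n x_t / n$ is the mean of the first $n$ observations. The requirement that the variables be identically distributed can be dropped. For $\mathcal{E}x_t = 0$ and $\mathcal{E}x_t^2 = \sigma_t^2$,

(1.1)  $\frac{1}{\tau_n} \sum_{t=1}^n x_t \xrightarrow{\mathcal{L}} N(0, 1)$,

where

(1.2)  $\tau_n^2 = \sum_{t=1}^n \sigma_t^2$,

if for any given $\varepsilon > 0$

(1.3)  $\frac{1}{\tau_n^2} \sum_{t=1}^n \mathcal{E}\, x_t^2 I(x_t^2 > \varepsilon \tau_n^2) \to 0$

as $n \to \infty$. Here $I(\cdot)$ is the indicator function. If $\sigma_n^2 / \tau_n^2 \to 0$ as $n \to \infty$, then (1.1) implies (1.3); in this sense the Lindeberg (1922) condition (1.3) is minimal.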
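As a small numerical illustration of condition (1.3), the following sketch evaluates the Lindeberg sum for a hypothetical two-point design that is not taken from the paper: $x_t = \pm\sigma_t$ with probability 1/2 each, so that $\mathcal{E}\,x_t^2 I(x_t^2 > \varepsilon\tau_n^2)$ equals $\sigma_t^2$ when $\sigma_t^2 > \varepsilon\tau_n^2$ and 0 otherwise. The function name `lindeberg_sum` and the two variance sequences are assumptions chosen for the example.

```python
def lindeberg_sum(variances, eps):
    """Left-hand side of (1.3) when x_t = +sigma_t or -sigma_t, each with probability 1/2.

    For this two-point law, E[x_t^2 I(x_t^2 > eps * tau_n^2)] is sigma_t^2 if
    sigma_t^2 > eps * tau_n^2 and 0 otherwise.
    """
    tau2 = sum(variances)  # tau_n^2 as in (1.2)
    return sum(s2 for s2 in variances if s2 > eps * tau2) / tau2


# Equal variances: no single term exceeds eps * tau_n^2 once n > 1/eps.
equal_case = lindeberg_sum([1.0] * 100, eps=0.05)

# Geometrically growing variances: the last terms dominate tau_n^2,
# so the sum stays bounded away from 0.
growing_case = lindeberg_sum([2.0 ** t for t in range(1, 21)], eps=0.25)
```

In the first sequence the sum is exactly zero for every $n > 1/\varepsilon$; in the second, $\sigma_n^2/\tau_n^2$ does not tend to 0 and the sum does not vanish.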

The condition of independence can be weakened to a condition of martingale differences. A very general theorem, which we shall use, has been given by Dvoretzky (1972). For justification of later theorems we state this result in terms of a triangular array of random variables (and include a normalization in the definition of the random variables).

Theorem (Dvoretzky). Let $x_{n1}, \ldots, x_{nn}$ be a set of random variables and $\mathcal{F}_{n0} \subset \mathcal{F}_{n1} \subset \cdots \subset \mathcal{F}_{nn}$ be a set of $\sigma$-fields, $n = 1, 2, \ldots$, such that $x_{nt}$ is $\mathcal{F}_{nt}$-measurable,

(1.4)  $\mathcal{E}(x_{nt} \mid \mathcal{F}_{n,t-1}) = 0$ a.s.,

(1.5)  $\mathcal{E}(x_{nt}^2 \mid \mathcal{F}_{n,t-1}) = \sigma_{nt}^2$ a.s.,

(1.6)  $\sum_{t=1}^n \sigma_{nt}^2 \xrightarrow{p} \sigma^2$

as $n \to \infty$, where $\sigma^2$ is constant, and for any given $\varepsilon > 0$

(1.7)  $\sum_{t=1}^n \mathcal{E}[x_{nt}^2 I(x_{nt}^2 > \varepsilon) \mid \mathcal{F}_{n,t-1}] \xrightarrow{p} 0$.

Then

(1.8)  $\sum_{t=1}^n x_{nt} \xrightarrow{\mathcal{L}} N(0, \sigma^2)$.

Dvoretzky actually showed that this result holds if $\mathcal{F}_{n,t-1}$ is replaced by the $\sigma$-field generated by $\sum_{i=1}^{t-1} x_{ni}$. Generalizations have been given in Section 3.2 of Hall and Heyde (1980) and Section 9.5 of Chow and Teicher (1988). Further references can be found in these books.

In this paper we consider the estimation of the matrix of regression coefficients $B$ in the model

(1.9)  $y_t = B z_t + v_t$,  $t = 1, 2, \ldots$,

where the unobservable vector disturbances $v_t$ are martingale differences; that is, the conditional expected value of $v_t$ given earlier observed $y_t$'s and $z_t$'s is 0. The conditional second-order moments of the $v_t$'s are finite, but not necessarily the same for all $t$. However, the $v_t$'s satisfy a kind of Lindeberg condition. The "independent" variables $z_t$ are assumed to have a sample covariance matrix that converges to a limit in probability, and the $z_t$'s satisfy a kind of asymptotic negligibility condition. It is shown that the least squares estimator of $B$ has an asymptotic distribution that is the same as in the case that the $v_t$'s are independent and normal with mean 0 and constant covariance matrix. Thus the disturbances do not need to be homoscedastic nor do they need to be independent. The relaxed conditions are particularly important when the observed $z_t$'s and $y_t$'s constitute a time series.
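The claim can be checked informally by simulation. The sketch below is only an illustration under assumptions that are not in the paper: a scalar model $y_t = b z_t + v_t$ with independent standard normal $z_t$ (so $M = 1$), and disturbances whose conditional variance is 1.5 when $v_{t-1} \ge 0$ and 0.5 otherwise. These $v_t$ are martingale differences that are neither independent nor homoscedastic, their conditional variances are bounded, and the average conditional variance tends to $\Sigma = 1$, so the limiting variance of $\sqrt{n}(\hat{b}_n - b)$ suggested by the result is $\Sigma / M = 1$. All function and variable names are hypothetical.

```python
import math
import random


def scaled_slope_errors(n, reps, b=2.0, seed=0):
    """Return sqrt(n) * (b_hat - b) for `reps` simulated samples of size n."""
    rng = random.Random(seed)
    scaled = []
    for _ in range(reps):
        v_prev = 0.0
        szz = 0.0  # running sum of z_t^2
        szy = 0.0  # running sum of z_t * y_t
        for _t in range(n):
            z = rng.gauss(0.0, 1.0)  # regressor with E z^2 = M = 1
            # Conditional variance depends on the past through the sign of v_{t-1}.
            sigma2_t = 1.5 if v_prev >= 0.0 else 0.5
            v = math.sqrt(sigma2_t) * rng.gauss(0.0, 1.0)
            y = b * z + v
            szz += z * z
            szy += z * y
            v_prev = v
        b_hat = szy / szz  # least squares slope
        scaled.append(math.sqrt(n) * (b_hat - b))
    return scaled


def sample_variance(xs):
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)


errors = scaled_slope_errors(n=400, reps=1000)
mc_var = sample_variance(errors)  # Monte Carlo variance of sqrt(n) * (b_hat - b)
theory_var = 1.0                  # Sigma / M under the assumptions stated above
```

The Monte Carlo variance should be close to the theoretical value, up to simulation noise of a few percent at this number of replications.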

In the autoregressive model, which is extensively used in time series analysis,

(1.10)  $x_t = B x_{t-1} + v_t$,  $t = 1, 2, \ldots$,

the vector $z_t$ is replaced by $x_{t-1}$. The conditions on the $v_t$'s imply the desired conditions on the $x_{t-1}$'s.

In Section 4 the mixed model is considered; the right-hand side may contain both lagged "dependent" variables and independent variables.

If the disturbances in the regression model are normal, independent, and homoscedastic, and the independent variables are nonstochastic, the estimator of $B$ has a normal distribution with expected value $B$ and covariances determined by the common covariance matrix of the disturbances; it follows that the asymptotic distribution is normal. The restriction of homoscedasticity was relaxed by Anderson (1971) in Theorems 2.6.1 and 2.6.2 under a Lindeberg-type condition on the disturbances and the condition that the sample covariance matrix of the independent variables have a nonsingular limit.

In the autoregression model the least squares estimator of $B$ is nonlinear in the disturbances. Mann and Wald (1943) showed that the asymptotic distribution of the estimator of $B$ is normal under the condition that the disturbances are independently identically distributed and possess moments of all orders. Anderson (1959) showed that in this case only the second-order moments need to be finite.

There are many recent results in this area. Lai and Robbins (1981) proved a theorem for a scalar dependent variable with independent identically distributed disturbances. Lai and Wei (1982) proved a similar theorem under the conditions that the moments of the disturbances of some order greater than 2 are bounded and that the variances of the disturbances converge to a constant a.s. Our approach follows these papers, but the conditions have been relaxed. Chan and Wei (1987) have used a Lindeberg condition for a special case of the autoregressive process; see also Lai and Siegmund (1983).

2.  Robustness  in  Regression. 

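We consider the regression model in which the observed vector-valued dependent variable $y_t$ is generated by

(2.1)  $y_t = B z_t + v_t$,  $t = 1, 2, \ldots$,

where $z_t$ is an observed vector-valued independent variable and $\{v_t\}$ is a sequence of (unobservable) martingale differences satisfying a Lindeberg-type condition.

Theorem 1. Let $\{z_t, v_t\}$, $t = 1, 2, \ldots$, be a sequence of pairs of random vectors, and let $\{\mathcal{F}_t\}$ be an increasing sequence of $\sigma$-fields such that $z_t$ is $\mathcal{F}_{t-1}$-measurable and $v_t$ is $\mathcal{F}_t$-measurable. Let the matrix $D_n$ be $\mathcal{F}_0$-measurable such that

(2.2)  $D_n^{-1} \sum_{t=1}^n z_t z_t' (D_n')^{-1} \xrightarrow{p} C$,

a constant matrix, as $n \to \infty$, and

(2.3)  $\max_{t=1,\ldots,n} z_t' (D_n D_n')^{-1} z_t \xrightarrow{p} 0$.

Suppose further that $\mathcal{E}(v_t \mid \mathcal{F}_{t-1}) = 0$ a.s., $\mathcal{E}(v_t v_t' \mid \mathcal{F}_{t-1}) = \Sigma_t$ a.s.,

(2.4)  $\sum_{t=1}^n [\Sigma_t \otimes D_n^{-1} z_t z_t' (D_n')^{-1}] \xrightarrow{p} \Sigma \otimes C$,

where $\Sigma$ is a constant positive semidefinite matrix, and

(2.5)  $\sup_{t=1,2,\ldots} \mathcal{E}[v_t' v_t I(v_t' v_t > a) \mid \mathcal{F}_{t-1}] \xrightarrow{p} 0$

as $a \to \infty$. Then

(2.6)  $\mathrm{vec}\left[D_n^{-1} \sum_{t=1}^n z_t v_t'\right] \xrightarrow{\mathcal{L}} N(0, \Sigma \otimes C)$.

Proof. The conclusion holds if

(2.7)  $\mathrm{tr}\, B' \sum_{t=1}^n v_t z_t' (D_n')^{-1} = \sum_{t=1}^n v_t' B D_n^{-1} z_t \xrightarrow{\mathcal{L}} N(0, \mathrm{tr}\, \Sigma B C B')$

for every $B$. Let $u_{nt} = B D_n^{-1} z_t$, $t = 1, \ldots, n$. Then

(2.8)  $\sum_{t=1}^n u_{nt} u_{nt}' \xrightarrow{p} B C B' = D$,

say. We want to show that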

(2.9)  $\sum_{t=1}^n u_{nt}' v_t \xrightarrow{\mathcal{L}} N(0, \mathrm{tr}\, \Sigma D)$.

Condition (2.3) implies

(2.10)  $\max_{t=1,\ldots,n} u_{nt}' u_{nt} \xrightarrow{p} 0$.

Let

(2.11)  $w_{nt} = u_{nt} I(\|u_{nt}\| \le 1)$,  $t = 1, \ldots, n$,  $n = 1, 2, \ldots$.

Then $\|w_{nt}\| \le 1$ and

(2.12)  $\Pr\{w_{nt} = u_{nt},\ t = 1, \ldots, n\} \to 1$

as $n \to \infty$.


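Now we shall verify that $x_{nt} = w_{nt}' v_t$ satisfy the conditions of Dvoretzky's theorem. We have

(2.13)  $\mathcal{E}(w_{nt}' v_t \mid \mathcal{F}_{t-1}) = w_{nt}' \mathcal{E}(v_t \mid \mathcal{F}_{t-1}) = 0$ a.s.,

(2.14)  $\sum_{t=1}^n \mathcal{E}[(w_{nt}' v_t)^2 \mid \mathcal{F}_{t-1}] = \sum_{t=1}^n w_{nt}' \Sigma_t w_{nt} \xrightarrow{p} \mathrm{tr}\, \Sigma D$

by (2.4). The third condition for $\{x_{nt}\}$ to satisfy is

(2.15)  $A_n(\delta) = \sum_{t=1}^n \mathcal{E}\{(w_{nt}' v_t)^2 I[(w_{nt}' v_t)^2 > \delta] \mid \mathcal{F}_{t-1}\} \xrightarrow{p} 0 \quad \forall\, \delta > 0$;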


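that is, given $\delta > 0$, $\varepsilon > 0$, and $\gamma > 0$, there exists $n(\varepsilon, \gamma)$ such that for $n > n(\varepsilon, \gamma)$

(2.16)  $\Pr\{A_n(\delta) \le \varepsilon\} > 1 - \gamma$.

We have

(2.17)  $A_n(\delta) \le \sum_{t=1}^n w_{nt}' w_{nt}\, \mathcal{E}\left[v_t' v_t I\left(v_t' v_t > \frac{\delta}{\|w_{nt}\|^2}\right) \Big|\, \mathcal{F}_{t-1}\right]$.

Given $\varepsilon^* > 0$ and $\gamma^* > 0$ there exists $n^*(\varepsilon^*, \gamma^*)$ such that for $n > n^*(\varepsilon^*, \gamma^*)$

(2.18)  $\Pr\{\|w_{nt}\|^2 < \varepsilon^*,\ t = 1, \ldots, n\} > 1 - \gamma^*$.

Hence

(2.19)  $\Pr\left\{A_n(\delta) \le \sum_{t=1}^n w_{nt}' w_{nt}\, \mathcal{E}\left[v_t' v_t I\left(v_t' v_t > \frac{\delta}{\varepsilon^*}\right) \Big|\, \mathcal{F}_{t-1}\right]\right\} > 1 - \gamma^*$.

Since

(2.20)  $\sum_{t=1}^n w_{nt}' w_{nt}\, \mathcal{E}\left[v_t' v_t I\left(v_t' v_t > \frac{\delta}{\varepsilon^*}\right) \Big|\, \mathcal{F}_{t-1}\right] \le \sum_{t=1}^n w_{nt}' w_{nt} \sup_{s=1,2,\ldots} \mathcal{E}\left[v_s' v_s I\left(v_s' v_s > \frac{\delta}{\varepsilon^*}\right) \Big|\, \mathcal{F}_{s-1}\right] = B_n\left(\frac{\delta}{\varepsilon^*}\right)$,

say. That is,

(2.21)  $\Pr\left\{A_n(\delta) \le B_n\left(\frac{\delta}{\varepsilon^*}\right)\right\} > 1 - \gamma^*$

if $n > n^*(\varepsilon^*, \gamma^*)$. Let

(2.22)  $C(d) = \sup_{s=1,2,\ldots} \mathcal{E}[v_s' v_s I(v_s' v_s > d) \mid \mathcal{F}_{s-1}]$.

Condition (2.5) is that given $\bar{\varepsilon} > 0$, $\bar{\gamma} > 0$ there exists a $d(\bar{\varepsilon}, \bar{\gamma})$ such that for $d > d(\bar{\varepsilon}, \bar{\gamma})$

(2.23)  $\Pr\{C(d) \le \bar{\varepsilon}\} > 1 - \bar{\gamma}$.

Condition (2.2) implies that given $\alpha > 0$, $\tilde{\gamma} > 0$ there exists $\tilde{n}(\alpha, \tilde{\gamma})$ such that for $n > \tilde{n}(\alpha, \tilde{\gamma})$

(2.24)  $\Pr\left\{\sum_{t=1}^n w_{nt}' w_{nt} \le \mathrm{tr}\, D + \alpha\right\} > 1 - \tilde{\gamma}$.

Hence

(2.25)  $\Pr\left\{B_n\left(\frac{\delta}{\varepsilon^*}\right) \le \varepsilon\right\} > 1 - \bar{\gamma} - \tilde{\gamma}$

if $(\mathrm{tr}\, D + \alpha)\bar{\varepsilon} < \varepsilon$, $\delta/\varepsilon^* > d(\bar{\varepsilon}, \bar{\gamma})$, and $n > \tilde{n}(\alpha, \tilde{\gamma})$. Then (2.16) holds if $\gamma^* + \bar{\gamma} + \tilde{\gamma} < \gamma$, $(\mathrm{tr}\, D + \alpha)\bar{\varepsilon} < \varepsilon$, $\varepsilon^* < \delta / d(\bar{\varepsilon}, \bar{\gamma})$, and $n > \max[n^*(\varepsilon^*, \gamma^*), \tilde{n}(\alpha, \tilde{\gamma})]$. The theorem follows from the theorem in the introduction [Dvoretzky (1972)]. [See, also, Corollary 3.1 of Hall and Heyde (1980) or Theorem 2, Section 9.5, of Chow and Teicher (1988).] ∎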


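Theorem 2. Let $\{v_t\}$ be a sequence of random vectors and let $\{\mathcal{F}_t\}$ be an increasing sequence of $\sigma$-fields such that $v_t$ is $\mathcal{F}_t$-measurable, $\mathcal{E}(v_t \mid \mathcal{F}_{t-1}) = 0$ a.s., $\mathcal{E}(v_t v_t' \mid \mathcal{F}_{t-1}) = \Sigma_t$ a.s.,

(2.26)  $\frac{1}{n} \sum_{t=1}^n \Sigma_t \xrightarrow{p} \Sigma$,

constant, and

(2.27)  $\frac{1}{n} \sum_{t=1}^n \mathcal{E}[v_t' v_t I(v_t' v_t > n\varepsilon) \mid \mathcal{F}_{t-1}] \xrightarrow{p} 0$ for every $\varepsilon > 0$.

Then

(2.28)  $\frac{1}{n} \sum_{t=1}^n v_t v_t' \xrightarrow{p} \Sigma$.

Proof. If $v_t$ is scalar, the proof follows from Theorem 2.23 of Hall and Heyde (1980) as indicated by Chan and Wei (1987). The theorem is then verified by taking arbitrary linear combinations of $v_t$. ∎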

Theorem 3. For $n$ observations on the model (2.1) define

(2.29)  $\hat{B}_n = \sum_{t=1}^n y_t z_t' \left(\sum_{t=1}^n z_t z_t'\right)^{-1}$,

(2.30)  $\hat{\Sigma}_n = \frac{1}{n} \sum_{t=1}^n (y_t - \hat{B}_n z_t)(y_t - \hat{B}_n z_t)' = \frac{1}{n} \sum_{t=1}^n v_t v_t' - \frac{1}{n} (\hat{B}_n - B) \sum_{t=1}^n z_t z_t' (\hat{B}_n - B)'$.

If the conditions of Theorem 1 hold with $C$ nonsingular, then

(2.31)  $\mathrm{vec}[(\hat{B}_n - B) D_n] \xrightarrow{\mathcal{L}} N(0, C^{-1} \otimes \Sigma)$.

If, further, (2.26) holds, then

(2.32)  $\hat{\Sigma}_n \xrightarrow{p} \Sigma$.

Proof. The proof of (2.31) is a straightforward application of Theorem 1. The second term on the right-hand side of (2.30) is

(2.33)  $\frac{1}{n} (\hat{B}_n - B) D_n \left[D_n^{-1} \sum_{t=1}^n z_t z_t' (D_n')^{-1}\right] [(\hat{B}_n - B) D_n]' \xrightarrow{p} 0$

by (2.2) and (2.31). ∎

The purpose of condition (2.3) is to assure asymptotic negligibility of $z_t v_t'$. What alternative conditions imply (2.3)?

Lemma 1. Let $\{z_t\}$ be a sequence of random vectors, and let $\{\mathcal{F}_t\}$ be an increasing sequence of $\sigma$-fields such that $z_t$ is $\mathcal{F}_t$-measurable. Let $D_n$ be $\mathcal{F}_0$-measurable such that $D_n^{-1} \to 0$ a.s., $D_n D_{n+1}^{-1} \to I$ a.s., and

(2.34)  $D_n^{-1} \sum_{t=1}^n z_t z_t' (D_n')^{-1} \to C$ a.s.

Then

(2.35)  $\max_{t=1,\ldots,n} z_t' (D_n D_n')^{-1} z_t \to 0$ a.s.

Proof. From (2.34) we have

(2.36)  $D_{n+1}^{-1} \sum_{t=1}^{n+1} z_t z_t' (D_{n+1}')^{-1} - D_n^{-1} \sum_{t=1}^n z_t z_t' (D_n')^{-1}$
  $= D_{n+1}^{-1} z_{n+1} z_{n+1}' (D_{n+1}')^{-1} + D_{n+1}^{-1} \sum_{t=1}^n z_t z_t' (D_{n+1}')^{-1} - D_n^{-1} \sum_{t=1}^n z_t z_t' (D_n')^{-1} \to 0$ a.s.

That is, $\|D_{n+1}^{-1} z_{n+1}\|^2 \to 0$ a.s. This implies (2.35) by the proof of Lemma 2.6.1 in Anderson (1971). ∎

A special case of $\{z_t\}$ is that of $z_t$ nonstochastic; then (2.34) (which is the same as (2.2) when $\{z_t\}$ is nonstochastic) implies (2.35) with the limits nonstochastic. In particular, if $D_n$ is diagonal and the $j$-th diagonal element of $D_n$ is the square root of the sum of squares of the $j$-th elements of the $z_t$'s, then $D_n^{-1} \sum_{t=1}^n z_t z_t' (D_n')^{-1}$ is the correlation matrix of $z_1, \ldots, z_n$. The theorem in this case is a relaxation of Theorems 2.6.1 and 2.6.2 of Anderson (1971).


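Theorem 4. Let $\{z_t\}$ be a sequence of random vectors, and let $\{\mathcal{F}_t\}$ be an increasing sequence of $\sigma$-fields such that $z_t$ is $\mathcal{F}_t$-measurable and

(2.37)  $\sum_{t=1}^n \mathcal{E}\{z_t' (D_n D_n')^{-1} z_t I[z_t' (D_n D_n')^{-1} z_t > \varepsilon] \mid \mathcal{F}_{t-1}\} \xrightarrow{p} 0$.

Then (2.3) holds.

Proof. We use Lemma 3.5 of Dvoretzky (1972): If $\{\mathcal{F}_t\}$ is an increasing sequence of $\sigma$-fields and $A_t \in \mathcal{F}_t$, then for every $\eta > 0$

(2.38)  $\Pr\left\{\bigcup_{t=1}^n A_t \,\Big|\, \mathcal{F}_0\right\} \le \eta + \Pr\left\{\sum_{t=1}^n P(A_t \mid \mathcal{F}_{t-1}) > \eta \,\Big|\, \mathcal{F}_0\right\}$.

For every $\varepsilon > 0$, $\eta > 0$

(2.39)  $\Pr\left\{\max_{t=1,\ldots,n} z_t' (D_n D_n')^{-1} z_t > \varepsilon \,\Big|\, \mathcal{F}_0\right\} = \Pr\left\{\bigcup_{t=1}^n [z_t' (D_n D_n')^{-1} z_t > \varepsilon] \,\Big|\, \mathcal{F}_0\right\}$
  $\le \eta + \Pr\left\{\sum_{t=1}^n \Pr(z_t' (D_n D_n')^{-1} z_t > \varepsilon \mid \mathcal{F}_{t-1}) > \eta \,\Big|\, \mathcal{F}_0\right\}$
  $\le \eta + \Pr\left\{\frac{1}{\varepsilon} \sum_{t=1}^n \mathcal{E}[z_t' (D_n D_n')^{-1} z_t I[z_t' (D_n D_n')^{-1} z_t > \varepsilon] \mid \mathcal{F}_{t-1}] > \eta \,\Big|\, \mathcal{F}_0\right\}$

by a form of Tchebycheff's inequality. By (2.37) the last probability on the right-hand side of (2.39) converges to 0. Since $\eta$ is arbitrary, (2.3) holds. ∎

Corollary 1. Let $\{z_t, v_t\}$, $t = 1, 2, \ldots$, be a sequence of pairs of random vectors, and let $\{\mathcal{F}_t\}$ be an increasing sequence of $\sigma$-fields such that $z_t$ is $\mathcal{F}_{t-1}$-measurable and $v_t$ is $\mathcal{F}_t$-measurable. Let $D_n$ be $\mathcal{F}_0$-measurable such that (2.2) and (2.37) hold. Suppose that $\mathcal{E}(v_t \mid \mathcal{F}_{t-1}) = 0$ a.s., $\mathcal{E}(v_t v_t' \mid \mathcal{F}_{t-1}) = \Sigma_t$ a.s., and (2.4) and (2.5) hold. Then (2.6) holds.

The condition (2.4) determines the limiting covariance matrix of $D_n^{-1} \sum_{t=1}^n z_t v_t'$.

Lemma 2. Let $\{z_t, v_t\}$ be a sequence of random vectors, and let $\{\mathcal{F}_t\}$ be an increasing sequence of $\sigma$-fields such that $z_t$ is $\mathcal{F}_{t-1}$-measurable and $v_t$ is $\mathcal{F}_t$-measurable such that $\mathcal{E}(v_t \mid \mathcal{F}_{t-1}) = 0$ a.s., $\mathcal{E}(v_t v_t' \mid \mathcal{F}_{t-1}) = \Sigma_t$ a.s., and $\Sigma_t \to \Sigma$ a.s., where $\Sigma$ is a constant matrix. Suppose $D_n$ is $\mathcal{F}_0$-measurable such that (2.2) holds. Then (2.4) and (2.26) hold. If, further, (2.3) and (2.5) hold, then (2.6) holds.

The homoscedastic case, $\Sigma_t = \Sigma$, is included and also the case of $\Sigma_t$ nonstochastic.

An important case of $\{z_t\}$ is that in which $D_n = \sqrt{n}\, I$; then $D_n^{-1} \sum_{t=1}^n z_t z_t' (D_n')^{-1} = (1/n) \sum_{t=1}^n z_t z_t'$; that is, this matrix is simply the sample covariance matrix for known mean 0.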


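Corollary 2. Let $\{z_t, v_t\}$ be a sequence of pairs of random vectors and let $\{\mathcal{F}_t\}$ be an increasing sequence of $\sigma$-fields such that $z_t$ is $\mathcal{F}_{t-1}$-measurable and $v_t$ is $\mathcal{F}_t$-measurable. Suppose

(2.40)  $\frac{1}{n} \sum_{t=1}^n z_t z_t' \xrightarrow{p} M$,

a constant matrix,

(2.41)  $\frac{1}{n} \max_{t=1,\ldots,n} z_t' z_t \xrightarrow{p} 0$,

$\mathcal{E}(v_t \mid \mathcal{F}_{t-1}) = 0$ a.s., $\mathcal{E}(v_t v_t' \mid \mathcal{F}_{t-1}) = \Sigma_t$ a.s.,

(2.42)  $\frac{1}{n} \sum_{t=1}^n (\Sigma_t \otimes z_t z_t') \xrightarrow{p} \Sigma \otimes M$,

and (2.5) holds. Then

(2.43)  $\frac{1}{\sqrt{n}}\, \mathrm{vec} \sum_{t=1}^n z_t v_t' \xrightarrow{\mathcal{L}} N(0, \Sigma \otimes M)$;

if, further, $M$ is nonsingular, then

(2.44)  $\sqrt{n}\, \mathrm{vec}(\hat{B}_n - B) \xrightarrow{\mathcal{L}} N(0, M^{-1} \otimes \Sigma)$;

and if, further, (2.26) holds, then (2.32) holds.

Condition (2.40) is equivalently $(1/n) \sum_{t=1}^n \mathrm{vec}\, z_t z_t' \xrightarrow{p} \mathrm{vec}\, M$; (2.26) is equivalently $(1/n) \sum_{t=1}^n \mathrm{vec}\, \Sigma_t \xrightarrow{p} \mathrm{vec}\, \Sigma$; and (2.42) is equivalently

(2.45)  $\frac{1}{n} \sum_{t=1}^n \mathrm{vec}\, \Sigma_t (\mathrm{vec}\, z_t z_t')' - \left(\frac{1}{n} \sum_{t=1}^n \mathrm{vec}\, \Sigma_t\right) \left(\frac{1}{n} \sum_{t=1}^n \mathrm{vec}\, z_t z_t'\right)' \xrightarrow{p} 0$.

The condition (2.45) is that $\mathrm{vec}\, \Sigma_t$ and $\mathrm{vec}\, z_t z_t'$ are asymptotically uncorrelated over $t$. Even if the $\Sigma_t$'s are nonstochastic and the $z_t$ are exogenous this condition is needed to obtain $\Sigma \otimes M$ as the covariance matrix of $(1/\sqrt{n})\, \mathrm{vec} \sum_{t=1}^n z_t v_t'$.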

3.  Robustness  in  Autoregression. 

We now consider the autoregressive model,

(3.1)  $x_t = B x_{t-1} + v_t$,  $t = 1, 2, \ldots$.

The form of (3.1) is (2.1) with $z_t$ replaced by $x_{t-1}$. We shall show that the least squares estimator of $B$ based on $x_0, \ldots, x_n$ has the asymptotic normal distribution of the least squares estimator in the regression case. In order to show the analogies to (2.2) and (2.3) we prove the following lemmas.

Lemma 3. If the characteristic roots of $B$ are less than 1 in absolute value and if $\max_{t=1,\ldots,n} v_t' v_t / n \xrightarrow{p} 0$, then for $x_1, x_2, \ldots$ generated by (3.1)

(3.2)  $\frac{1}{n} \max_{t=1,\ldots,n} x_{t-1}' x_{t-1} \xrightarrow{p} 0$.

Proof. Since $x_0' x_0 / n \xrightarrow{p} 0$ and the roots of $B$ are less than 1 in absolute value, $x_0' (B')^{t-1} B^{t-1} x_0 / n \xrightarrow{p} 0$ and we need only consider

(3.3)  $x_{t-1}^* = \sum_{s=0}^{t-2} B^s v_{t-1-s}$.

Then

(3.4)  $x_{t-1}^{*\prime} x_{t-1}^* = \sum_{r,s=0}^{t-2} v_{t-r-1}' (B')^r B^s v_{t-s-1} \le q \sum_{r,s=0}^{t-2} \lambda^{r+s} r^{p-1} s^{p-1} (\|v_{t-r-1}\|^2 + \|v_{t-s-1}\|^2)$,

where $\lambda$ is the largest absolute value of the characteristic roots of $B$ and $q$ is a suitable constant. (See Lemma 7 in the appendix.) Then

(3.5)  $\frac{1}{n} \max_{t=1,\ldots,n} \|x_{t-1}^*\|^2 \le \frac{2q}{n} \max_{t=1,\ldots,n} \|v_t\|^2 \left(\sum_{s=0}^{n-2} \lambda^s s^{p-1}\right)^2$.

Since the sum in (3.5) is bounded as $n \to \infty$, (3.2) follows. ∎


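Lemma 4. Let $x_1, x_2, \ldots$ be generated by (3.1) with $\mathcal{E} x_0 x_0' = \Sigma_0$. Let $\{\mathcal{F}_t\}$ be an increasing sequence of $\sigma$-fields such that $x_t$ and $v_t$ are $\mathcal{F}_t$-measurable. Suppose the characteristic roots of $B$ are less than 1 in absolute value, $\mathcal{E}(v_t \mid \mathcal{F}_{t-1}) = 0$ a.s., $\mathcal{E}(v_t v_t' \mid \mathcal{F}_{t-1}) = \Sigma_t$ a.s., (2.26) holds with $\Sigma$ constant, and (2.27) holds. Define

(3.6)  $\Gamma = \sum_{s=0}^{\infty} B^s \Sigma (B')^s$.

Then (2.28) holds,

(3.7)  $\frac{1}{n} \sum_{t=1}^n v_t x_{t-1}' \xrightarrow{p} 0$,

(3.8)  $\frac{1}{n} \sum_{t=1}^n x_{t-1} x_{t-1}' \xrightarrow{p} \Gamma$.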


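Proof. From (3.1) we have

(3.9)  $x_{t-1} = \sum_{s=0}^{t-2} B^s v_{t-1-s} + B^{t-1} x_0$.

For some $\delta > 0$ define $x_{n0} = x_0$,

(3.10)  $v_{nt} = v_t I\left(\sum_{s=1}^{t} \mathrm{tr}\, \Sigma_s \le n(1 + \delta)\, \mathrm{tr}\, \Sigma\right)$,

(3.11)  $x_{n,t-1} = \sum_{s=0}^{t-2} B^s v_{n,t-1-s} + B^{t-1} x_{n0}$.

Then

(3.12)  $\Pr\{v_{nt} = v_t,\ t = 1, \ldots, n\} \to 1$,

(3.13)  $\Pr\{x_{n,t-1} = x_{t-1},\ t = 1, \ldots, n\} \to 1$,

(3.14)  $\Pr\{v_{nt} x_{n,t-1}' = v_t x_{t-1}',\ t = 1, \ldots, n\} \to 1$.

By construction $\mathcal{E}\|v_{nt}\|^2 \le n(1 + \delta)\, \mathrm{tr}\, \Sigma$ and $\mathcal{E}\|x_{n,t-1}\|^2 < \infty$. Then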


(3.15)  tr  £  T>„, Y^  iE,1,i-l1)n»j 


=  VtrE 

nz 


'vntXni^\Xn,a  —  \Vns 


s,t=\ 
n 


1  " 

=  — r  tr  €  ^  '  xn  t_iXn,3—ivn  sv„t 

TXl 

s,t=  1 

1  " 

=  — N  xn  t_iXn  g~iS  [vnsvnt\Emax(3,t)—i) 
rr  ' 

s+=i 
1  " 

=  -2  5  V  lT  • 

nz 


t= l 


Since  max(=i,...,n  lkt||2/r*  0  by  Theorem  4,  we  have  max(=i . n  T(||vnt  ||2 \Et-i)/n 


0  by  (3.6).  Now  consider  for  2  <  t  <  n  —  1 
1  " 

(3.16)  —  £  xn,t-lxn,t-l 

n  *■ — ' 


(=1 


=  -£ 
n 


n  /  t  —  2 


t- 2 


XoXo  +  Yj  (  Yu  Br vn,t-r-\  +  B*  1  *0  j  f  ^  BSVn,t-s-l  +  B  X0 

t  =  2  '  r=0  '  '  s=° 

=  i  YY'B'Sv., +  i  £ B'-'Sv(B')'-' 

U  t=2 s=0  '=1 

=  £S*I  £  - .<•* - ,(«')*+ 


s  =  0  <=5+2 


<=1 
The trace of the first term on the right-hand side of (3.16) is not greater than $(1 + \delta)\, \mathrm{tr}\, \Sigma$. Hence, (3.15) $\to 0$, and (3.7) is proved.

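From (2.28) and (3.1) we have

(3.17)  $\frac{1}{n} \sum_{t=1}^n v_t v_t' = \frac{1}{n} \sum_{t=1}^n (x_t x_t' - x_t x_{t-1}' B' - B x_{t-1} x_t' + B x_{t-1} x_{t-1}' B') \xrightarrow{p} \Sigma$.

From (3.7) and (3.1) we have

(3.18)  $\frac{1}{n} \sum_{t=1}^n v_t x_{t-1}' = \frac{1}{n} \sum_{t=1}^n (x_t x_{t-1}' - B x_{t-1} x_{t-1}') \xrightarrow{p} 0$.

If we add to (3.17) the result of multiplying (3.18) on the right by $B'$ and the transpose of that product, we obtain

(3.19)  $\frac{1}{n} \sum_{t=1}^n v_t v_t' + \frac{1}{n} \sum_{t=1}^n v_t x_{t-1}' B' + \frac{1}{n} B \sum_{t=1}^n x_{t-1} v_t' = \frac{1}{n} \sum_{t=1}^n x_t x_t' - B \frac{1}{n} \sum_{t=1}^n x_{t-1} x_{t-1}' B' \xrightarrow{p} \Sigma$.

Furthermore, Lemma 3 implies

(3.20)  $\frac{1}{n} \sum_{t=1}^n x_t x_t' - \frac{1}{n} \sum_{t=1}^n x_{t-1} x_{t-1}' = \frac{1}{n} x_n x_n' - \frac{1}{n} x_0 x_0' \xrightarrow{p} 0$.

Then (3.19) is equivalent to

(3.21)  $\frac{1}{n} \sum_{t=1}^n x_t x_t' - B \frac{1}{n} \sum_{t=1}^n x_t x_t' B' \xrightarrow{p} \Sigma$,

which implies

(3.22)  $\Gamma = \mathrm{plim}_{n \to \infty} \frac{1}{n} \sum_{t=1}^n x_t x_t' = \mathrm{plim}_{n \to \infty} \frac{1}{n} \sum_{t=1}^n x_{t-1} x_{t-1}'$.

See Problem 27 of Chapter 5 of Anderson (1971). Then (3.8) follows. ∎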
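The step from (3.21) to (3.22) rests on the fact that $\Gamma$ in (3.6) solves $\Gamma - B \Gamma B' = \Sigma$. The sketch below checks this numerically by truncating the series in (3.6). The particular matrices `B` and `Sigma`, the truncation length, and the helper names are illustrative assumptions, not values from the paper.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def gamma_series(B, Sigma, terms=200):
    """Truncated series sum_{s=0}^{terms-1} B^s Sigma (B')^s from (3.6)."""
    Bt = transpose(B)
    term = [row[:] for row in Sigma]            # B^0 Sigma (B')^0
    total = [[0.0] * len(Sigma) for _ in Sigma]
    for _ in range(terms):
        total = [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(total, term)]
        term = matmul(matmul(B, term), Bt)      # next term B^(s+1) Sigma (B')^(s+1)
    return total


# Example with characteristic roots 0.5 and 0.3, both less than 1 in absolute value.
B = [[0.5, 0.2], [0.0, 0.3]]
Sigma = [[1.0, 0.4], [0.4, 2.0]]

Gamma = gamma_series(B, Sigma)
BGBt = matmul(matmul(B, Gamma), transpose(B))
# Residual of the equation Gamma - B Gamma B' = Sigma.
residual = [[Gamma[i][j] - BGBt[i][j] - Sigma[i][j] for j in range(2)] for i in range(2)]
```

Because the roots of `B` are inside the unit circle, the truncated terms decay geometrically and the residual is at the level of floating-point rounding.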


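Theorem 5. Let $x_1, x_2, \ldots$ be generated by (3.1), where $v_1, v_2, \ldots$ is a sequence of random vectors and $\mathcal{E} x_0 x_0' = \Sigma_0$. Let $\{\mathcal{F}_t\}$ be an increasing sequence of $\sigma$-fields such that $x_t$ and $v_t$ are $\mathcal{F}_t$-measurable. Suppose that the characteristic roots of $B$ are less than 1 in absolute value, $\mathcal{E}(v_t \mid \mathcal{F}_{t-1}) = 0$ a.s., $\mathcal{E}(v_t v_t' \mid \mathcal{F}_{t-1}) = \Sigma_t$ a.s., (2.26) holds with $\Sigma$ constant, and (2.5) holds. Furthermore, suppose

(3.23)  $\frac{1}{n} \sum_{t=\max(r,s)+2}^{n} (\Sigma_t \otimes v_{t-1-r} v_{t-1-s}') \xrightarrow{p} \delta_{rs} (\Sigma \otimes \Sigma)$,

where $\delta_{ss} = 1$ and $\delta_{rs} = 0$ for $r \ne s$. Then

(3.24)  $\frac{1}{\sqrt{n}}\, \mathrm{vec} \sum_{t=1}^n x_{t-1} v_t' \xrightarrow{\mathcal{L}} N(0, \Sigma \otimes \Gamma)$.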


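Proof. In Corollary 2 we take $z_t = x_{t-1}$. We want to verify (2.40), (2.41), and (2.42); (2.5) is assumed. Since (2.5) implies (2.27), Lemma 4 includes (3.8), which is equivalent to (2.40).

We have

(3.25)  $\frac{1}{n} \sum_{t=1}^n (\Sigma_t \otimes x_{t-1} x_{t-1}') = \frac{1}{n} \sum_{t=1}^n \left[\Sigma_t \otimes \left(\sum_{s=0}^{t-2} B^s v_{t-s-1} + B^{t-1} x_0\right) \left(\sum_{s=0}^{t-2} B^s v_{t-s-1} + B^{t-1} x_0\right)'\right]$.

If we define $v_0 = v_{-1} = \cdots = 0$, we can write

(3.26)  $\sum_{s=0}^{t-2} B^s v_{t-s-1} + B^{t-1} x_0 = \sum_{s=0}^{\infty} B^s v_{t-s-1} + B^{t-1} x_0 = \sum_{s=0}^{k} B^s v_{t-s-1} + \sum_{s=k+1}^{\infty} B^s v_{t-s-1} + B^{t-1} x_0$.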


For $t > p + 1$

(3.27)  $\|B^{t-1} x_0\|^2 \le 2 \lambda^{2(t-1)} q (t - 1)^{2(p-1)} \|x_0\|^2$.

Hence

(3.28)  $\frac{1}{n} \sum_{t=1}^n [\Sigma_t \otimes B^{t-1} x_0 x_0' (B')^{t-1}] \xrightarrow{p} 0$.

(See Lemma 8 in the Appendix.)

Consider the positive semidefinite matrix

(3.29)  $\frac{1}{n} \sum_{t=1}^n \left[\Sigma_t \otimes \sum_{r,s=k+1}^{\infty} B^r v_{t-r-1} v_{t-s-1}' (B')^s\right]$.

We shall show that with arbitrarily high probability the trace of (3.29) is arbitrarily small if $k$ is large enough. That will follow by showing the same property of

(3.30)  $\frac{1}{n} \sum_{t=1}^n \left[\Sigma_{nt} \otimes \sum_{r,s=k+1}^{\infty} B^r v_{n,t-r-1} v_{n,t-s-1}' (B')^s\right]$,

where $\Sigma_{nt} = \mathcal{E}(v_{nt} v_{nt}' \mid \mathcal{F}_{t-1})$. The expected value of the trace of the second matrix in (3.30) is


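(3.31)  $\mathcal{E} \sum_{r,s=k+1}^{\infty} \mathrm{tr}\, B^r v_{n,t-r-1} v_{n,t-s-1}' (B')^s = \sum_{s=k+1}^{\infty} \mathcal{E}\, \mathrm{tr}\, B^s v_{n,t-s-1} v_{n,t-s-1}' (B')^s$
  $\le q^* \sum_{s=k+1}^{\infty} \lambda^{2s} s^{2p}\, \mathcal{E}\, v_{n,t-s-1}' v_{n,t-s-1}$
  $= q^* \sum_{s=k+1}^{\infty} \lambda^{2s} s^{2p}\, \mathcal{E}\{\mathcal{E}[v_{n,t-s-1}' v_{n,t-s-1} I(v_{n,t-s-1}' v_{n,t-s-1} \le a) \mid \mathcal{F}_{t-s-2}] + \mathcal{E}[v_{n,t-s-1}' v_{n,t-s-1} I(v_{n,t-s-1}' v_{n,t-s-1} > a) \mid \mathcal{F}_{t-s-2}]\}$
  $\le q^* \sum_{s=k+1}^{\infty} \lambda^{2s} s^{2p} \left\{a + \mathcal{E} \sup_{t=1,2,\ldots} \mathcal{E}[v_t' v_t I(v_t' v_t > a) \mid \mathcal{F}_{t-1}]\right\}$.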


Since $\sum_{s=k+1}^{\infty} \lambda^{2s} s^{2p}$ converges, the second part of the right-hand side of (3.31) can be made arbitrarily small by taking $a$ large enough; the first term can be made arbitrarily small by making $k$ sufficiently large. Thus (3.31) is arbitrarily small, and by Tchebycheff's inequality the second matrix in (3.30) is arbitrarily small with arbitrarily high probability. Now

(3.32)  $\frac{1}{n} \sum_{t=1}^n \left[\Sigma_t \otimes \sum_{r,s=0}^{k} B^r v_{t-r-1} v_{t-s-1}' (B')^s\right] \xrightarrow{p} \Sigma \otimes \sum_{s=0}^{k} B^s \Sigma (B')^s$.

If the right-hand side of (3.26) is written as $a_t + b_t + c_t$, we have shown above that

(3.33)  $\frac{1}{n} \sum_{t=1}^n (\Sigma_t \otimes c_t c_t') \xrightarrow{p} 0$

and that

(3.34)  $\mathrm{tr}\, \frac{1}{n} \sum_{t=1}^n (\Sigma_t \otimes b_t b_t')$

can be made to converge in probability as $n \to \infty$ to an arbitrarily small quantity. It follows from the Cauchy-Schwarz inequality that

(3.35)  $\frac{1}{n} \sum_{t=1}^n (\Sigma_t \otimes a_t c_t') \xrightarrow{p} 0$,

(3.36)  $\frac{1}{n} \sum_{t=1}^n (\Sigma_t \otimes b_t c_t') \xrightarrow{p} 0$,

and that

(3.37)  $\frac{1}{n} \sum_{t=1}^n (\Sigma_t \otimes a_t b_t')$

can be made to converge in probability to an arbitrarily small quantity. Hence,

(3.38)  $\frac{1}{n} \sum_{t=1}^n [\Sigma_t \otimes x_{t-1} x_{t-1}'] \xrightarrow{p} \Sigma \otimes \Gamma$.

Hence, by Corollary 2 (3.24) follows. ∎
The least squares estimator of $B$ is

(3.39)  $\hat{B}_n = \sum_{t=1}^n x_t x_{t-1}' \left(\sum_{t=1}^n x_{t-1} x_{t-1}'\right)^{-1}$,

and the estimator of $\Sigma$ is

(3.40)  $\hat{\Sigma}_n = \frac{1}{n} \sum_{t=1}^n (x_t - \hat{B}_n x_{t-1})(x_t - \hat{B}_n x_{t-1})' = \frac{1}{n} \sum_{t=1}^n v_t v_t' - (\hat{B}_n - B) \frac{1}{n} \sum_{t=1}^n x_{t-1} x_{t-1}' (\hat{B}_n - B)'$.

Corollary 3. Suppose the conditions of Theorem 5 hold and $\Gamma$ is nonsingular. Then

(3.41)  $\sqrt{n}\, \mathrm{vec}(\hat{B}_n - B) \xrightarrow{\mathcal{L}} N(0, \Gamma^{-1} \otimes \Sigma)$,

and (2.32) holds.

The conditions (3.23) in autoregression replace condition (2.4) in regression; they imply (3.38) which is the analog of (2.4). The limit (3.38) is that $\mathrm{vec}\, \Sigma_t$ and $\mathrm{vec}\, x_{t-1} x_{t-1}'$ are asymptotically uncorrelated. The condition holds identically in $B$; the conditions (3.23) are independent of $B$.

Corollary 4. Under the conditions of Theorem 5 with (2.26) and (3.23) replaced by $\Sigma_t \to \Sigma$ a.s., (3.24) holds. If $\Sigma$ is nonsingular, (3.41) and (2.32) hold.

Proof. The condition $\Sigma_t \to \Sigma$ a.s., where $\Sigma$ is constant, implies (2.26) and (3.23). ∎
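A rough Monte Carlo check of (3.41) in the setting of Corollary 4 can be sketched as follows. The design is an assumption made for illustration only: a scalar AR(1) with $B = b = 0.5$, $x_0 = 0$, and independent non-normal disturbances $v_t = \sqrt{1 + 1/t}\, u_t$ with $u_t$ uniform on $[-\sqrt{3}, \sqrt{3}]$, so that $\Sigma_t = 1 + 1/t \to \Sigma = 1$. In the scalar case $\Gamma = \Sigma/(1 - b^2)$, so the limiting variance $\Gamma^{-1}\Sigma$ is $1 - b^2$. The function names are hypothetical.

```python
import math
import random


def scaled_ar1_errors(n, reps, b=0.5, seed=0):
    """Return sqrt(n) * (b_hat - b) for `reps` simulated AR(1) paths of length n."""
    rng = random.Random(seed)
    half_width = math.sqrt(3.0)  # uniform on [-sqrt(3), sqrt(3)] has variance 1
    scaled = []
    for _ in range(reps):
        x_prev = 0.0
        sxx = 0.0  # running sum of x_{t-1}^2
        sxy = 0.0  # running sum of x_t * x_{t-1}
        for t in range(1, n + 1):
            # Non-normal, non-identically distributed disturbance; variance 1 + 1/t -> 1.
            v = math.sqrt(1.0 + 1.0 / t) * rng.uniform(-half_width, half_width)
            x = b * x_prev + v
            sxy += x * x_prev
            sxx += x_prev * x_prev
            x_prev = x
        b_hat = sxy / sxx  # scalar version of (3.39)
        scaled.append(math.sqrt(n) * (b_hat - b))
    return scaled


def sample_variance(xs):
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)


ar_errors = scaled_ar1_errors(n=400, reps=1000)
ar_mc_var = sample_variance(ar_errors)
ar_theory_var = 1.0 - 0.5 ** 2  # Gamma^{-1} * Sigma = 1 - b^2 in the scalar case
```

The simulated variance should be near 0.75. Periodic or sign-dependent conditional variances were deliberately avoided here, since such designs need not satisfy (3.23).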
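A higher order autoregressive process can be reduced to the first-order process. Suppose $X_1, X_2, \ldots$ satisfy

(3.42)  $X_t = B_1 X_{t-1} + \cdots + B_p X_{t-p} + V_t$,  $t = 1, 2, \ldots$.

Define

(3.43)  $x_t = \begin{pmatrix} X_t \\ X_{t-1} \\ \vdots \\ X_{t-p+1} \end{pmatrix}, \quad v_t = \begin{pmatrix} V_t \\ 0 \\ \vdots \\ 0 \end{pmatrix},$

(3.44)  $B = \begin{pmatrix} B_1 & B_2 & \cdots & B_{p-1} & B_p \\ I & 0 & \cdots & 0 & 0 \\ 0 & I & \cdots & 0 & 0 \\ \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & \cdots & I & 0 \end{pmatrix}, \quad \Sigma_t = \begin{pmatrix} \Omega_t & 0 & \cdots & 0 \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 0 \end{pmatrix},$

where $\mathcal{E}(V_t \mid \mathcal{F}_{t-1}) = 0$ a.s., $\mathcal{E}(V_t V_t' \mid \mathcal{F}_{t-1}) = \Omega_t$ a.s., and $\{\mathcal{F}_t\}$ is an increasing sequence of $\sigma$-fields such that $X_t$ and $V_t$ are $\mathcal{F}_t$-measurable. Then $\{x_t\}$ satisfies (3.1).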
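The reduction can be verified mechanically. The sketch below treats the scalar case of (3.42) (each $B_i$ a number, so $I$ becomes 1), builds the companion matrix of (3.44), and checks that the stacked vectors of (3.43) satisfy the first-order recursion (3.1) along a simulated path. The coefficients, the Gaussian disturbances, the zero presample values and all names are assumptions made only for this example.

```python
import random


def companion(coeffs):
    """Companion matrix B of (3.44) for scalar coefficients b_1, ..., b_p."""
    p = len(coeffs)
    rows = [list(coeffs)]
    for i in range(p - 1):
        rows.append([1.0 if j == i else 0.0 for j in range(p)])
    return rows


def matvec(a, x):
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


rng = random.Random(1)
coeffs = [0.5, -0.3, 0.1]  # hypothetical b_1, b_2, b_3
p = len(coeffs)
B = companion(coeffs)

X = [0.0] * p  # presample values X_{-p+1}, ..., X_0
V = []
for _ in range(50):
    v = rng.gauss(0.0, 1.0)
    V.append(v)
    # X_t = b_1 X_{t-1} + ... + b_p X_{t-p} + V_t, as in (3.42)
    X.append(sum(c * X[-1 - i] for i, c in enumerate(coeffs)) + v)

max_err = 0.0
for j, v in enumerate(V):
    k = p + j                                    # position of X_t in the list
    x_t = [X[k - i] for i in range(p)]           # (X_t, ..., X_{t-p+1})'
    x_prev = [X[k - 1 - i] for i in range(p)]    # (X_{t-1}, ..., X_{t-p})'
    v_t = [v] + [0.0] * (p - 1)                  # (V_t, 0, ..., 0)'
    pred = [bx + e for bx, e in zip(matvec(B, x_prev), v_t)]
    max_err = max(max_err, max(abs(a - b) for a, b in zip(pred, x_t)))
```

The first row of `B` reproduces (3.42), and the remaining rows only shift the lagged values down by one position, which is why the recursion holds exactly up to rounding.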


Theorem  6.  Let 


(3.45) 


X-i 


IK-x'-i . = 

LX-P+J 

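and let $X_1, X_2, \ldots$ be generated by (3.42). Let $\{\mathcal{F}_t\}$ be an increasing sequence of $\sigma$-fields such that $X_t$ and $V_t$ are $\mathcal{F}_t$-measurable. Suppose the roots of

(3.46)  $|\lambda^p I - \lambda^{p-1} B_1 - \cdots - B_p| = 0$

are less than 1 in absolute value, $\mathcal{E}(V_t \mid \mathcal{F}_{t-1}) = 0$ a.s., $\mathcal{E}(V_t V_t' \mid \mathcal{F}_{t-1}) = \Omega_t$ a.s.,

(3.47)  $\frac{1}{n} \sum_{t=1}^n \Omega_t \xrightarrow{p} \Omega$,

which is nonsingular and constant, and (2.5) holds with $v_t$ replaced by $V_t$. Define

(3.48)  $(\hat{B}_{1n}, \hat{B}_{2n}, \ldots, \hat{B}_{pn}) = \sum_{t=1}^n X_t (X_{t-1}', \ldots, X_{t-p}') \begin{pmatrix} \sum_{t=1}^n X_{t-1} X_{t-1}' & \cdots & \sum_{t=1}^n X_{t-1} X_{t-p}' \\ \vdots & & \vdots \\ \sum_{t=1}^n X_{t-p} X_{t-1}' & \cdots & \sum_{t=1}^n X_{t-p} X_{t-p}' \end{pmatrix}^{-1}$,

(3.49)  $\hat{\Omega}_n = \frac{1}{n} \sum_{t=1}^n (X_t - \hat{B}_{1n} X_{t-1} - \cdots - \hat{B}_{pn} X_{t-p})(X_t - \hat{B}_{1n} X_{t-1} - \cdots - \hat{B}_{pn} X_{t-p})'$.

Then

(3.50)  $\hat{\Omega}_n \xrightarrow{p} \Omega$,

(3.51)  $\frac{1}{n} \sum_{t=1}^n \begin{pmatrix} X_{t-1} \\ X_{t-2} \\ \vdots \\ X_{t-p} \end{pmatrix} (X_{t-1}', X_{t-2}', \ldots, X_{t-p}') \xrightarrow{p} \sum_{s=0}^{\infty} B^s \Sigma (B')^s = \Gamma$,

say, where

(3.52)  $\Sigma = \begin{pmatrix} \Omega & 0 & \cdots & 0 \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 0 \end{pmatrix}$,

and

(3.53)  $\sqrt{n}\, \mathrm{vec}(\hat{B}_{1n} - B_1, \ldots, \hat{B}_{pn} - B_p) \xrightarrow{\mathcal{L}} N(0, \Gamma^{-1} \otimes \Omega)$.

Lemma 5. If $\Omega$ is nonsingular, $\Gamma$ is nonsingular.

Proof. The proof is a vector generalization of the proof of Lemma 5.5.5 of Anderson (1971). ∎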


4.  Robustness  in  Mixed  Regression  and  Autoregression 

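Now we consider the model

(4.1)  $x_t = B x_{t-1} + A z_t + v_t$,  $t = 1, 2, \ldots$.

This model is analogous to the regression model (2.1) with $z_t$ replaced by $(x_{t-1}', z_t')'$. The least squares estimator of $(B, A)$ is

(4.2)  $(\hat{B}_n, \hat{A}_n) = \sum_{t=1}^n x_t (x_{t-1}', z_t') \begin{pmatrix} \sum_{t=1}^n x_{t-1} x_{t-1}' & \sum_{t=1}^n x_{t-1} z_t' \\ \sum_{t=1}^n z_t x_{t-1}' & \sum_{t=1}^n z_t z_t' \end{pmatrix}^{-1}$,

and the estimator of $\Sigma$ is

(4.3)  $\hat{\Sigma}_n = \frac{1}{n} \sum_{t=1}^n (x_t - \hat{B}_n x_{t-1} - \hat{A}_n z_t)(x_t - \hat{B}_n x_{t-1} - \hat{A}_n z_t)'$.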


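Theorem 7. Let $\mathcal{E} x_0 x_0' = \Sigma_0$; let $x_1, x_2, \ldots$ be generated by (4.1), and let $z_1, z_2, \ldots$ be a sequence of random variables (possibly degenerate). Let $\{\mathcal{F}_t\}$ be a sequence of increasing $\sigma$-fields such that $v_t$ is $\mathcal{F}_t$-measurable and $z_t$ is $\mathcal{F}_{t-1}$-measurable. Suppose the characteristic roots of $B$ are less than 1 in absolute value, $\mathcal{E}(v_t \mid \mathcal{F}_{t-1}) = 0$ a.s., $\mathcal{E}(v_t v_t' \mid \mathcal{F}_{t-1}) = \Sigma_t$ a.s., and (2.5), (2.26), and (2.41) hold. Suppose

(4.4)  $\frac{1}{n} \sum_{t=1}^{n-h} z_{t+h} z_t' \xrightarrow{p} M_h = M_{-h}'$,  $h = 0, 1, 2, \ldots$,

(4.5)  $\frac{1}{n} \sum_{t=1}^{n-h} z_{t+h} v_t' \xrightarrow{p} 0$,  $h = 1, 2, \ldots$.

Define

(4.6)  $L = \sum_{s=0}^{\infty} B^s A M_{-(s+1)}$.

Then

(4.7)  $\frac{1}{n} \sum_{t=1}^n x_{t-1} z_t' \xrightarrow{p} L$,

(4.8)  $\frac{1}{n} \sum_{t=1}^n x_{t-1} x_{t-1}' \xrightarrow{p} Q$,

where $Q$ is the unique solution to

(4.9)  $Q - B Q B' = \Sigma + B L A' + A L' B' + A M_0 A'$.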

Furthermore,  if  (2.42)  and  (3.23)  hold  and 


(4.10) 


-T  (27,  ::r JUo.  * 

/  J  *  ^ 


1.2 . 


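then

(4.11)  $\sqrt{n}\, \mathrm{vec}(\hat{B}_n - B, \hat{A}_n - A) \xrightarrow{\mathcal{L}} N\left(0, \begin{pmatrix} Q & L \\ L' & M_0 \end{pmatrix}^{-1} \otimes \Sigma\right)$,

and (2.32) holds under the further assumption that the inverse matrix in (4.11) exists.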


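Proof. Because the roots of $B$ are less than 1 in absolute value, the sum in (4.6) converges (by use of the Cauchy-Schwarz inequality). From (4.1) we obtain

(4.12)  $x_{t-1} = \sum_{s=0}^{t-2} B^s v_{t-1-s} + B^{t-1} x_0 + \sum_{s=0}^{t-2} B^s A z_{t-1-s}$
  $= \sum_{s=0}^{k} B^s v_{t-1-s} + \sum_{s=k+1}^{\infty} B^s v_{t-1-s} + B^{t-1} x_0 + \sum_{s=0}^{k} B^s A z_{t-1-s} + \sum_{s=k+1}^{\infty} B^s A z_{t-1-s}$,

where $v_0 = v_{-1} = \cdots = 0$ and $z_0 = z_{-1} = \cdots = 0$. Then

(4.13)  $\frac{1}{n} \sum_{t=1}^n x_{t-1} z_t' = \frac{1}{n} \sum_{t=1}^n \sum_{s=0}^{k} B^s (v_{t-1-s} + A z_{t-1-s}) z_t' + \frac{1}{n} \sum_{t=1}^n \left[B^{t-1} x_0 z_t' + \sum_{s=k+1}^{\infty} B^s (v_{t-1-s} + A z_{t-1-s}) z_t'\right]$.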

We  calculate  bv  use  of  Lemma 


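(4.14)  $\left\|\frac{1}{n} \sum_{t=1}^n \sum_{s=k+1}^{\infty} B^s v_{t-1-s} z_t'\right\| \le \frac{q}{n} \sum_{t=1}^n \sum_{s=k+1}^{\infty} \lambda^s s^{p-1} (\|v_{t-1-s}\|^2 + \|z_t\|^2) \le q \sum_{s=k+1}^{\infty} \lambda^s s^{p-1}\, \frac{1}{n} \sum_{t=1}^n (\|v_t\|^2 + \|z_t\|^2)$.

Since $\sum_{s=k+1}^{\infty} \lambda^s s^{p-1}$ converges and $\sum_{t=1}^n \|z_t\|^2 / n \xrightarrow{p} \mathrm{tr}\, M_0$, we can choose $k$ sufficiently large to make the right-hand side of (4.14) arbitrarily small with arbitrarily high probability. Similarly the other two terms in the second sum in (4.13) can be made small. Then

(4.15)  $\frac{1}{n} \sum_{t=1}^n \sum_{s=0}^{k} B^s (v_{t-1-s} + A z_{t-1-s}) z_t' \xrightarrow{p} \sum_{s=0}^{k} B^s A M_{-(s+1)}$.

That leads to (4.7).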

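From (4.1) we have

(4.16)  $\frac{1}{n} \sum_{t=1}^n v_t v_t' = \frac{1}{n} \sum_{t=1}^n [x_t x_t' - B x_{t-1} x_t' - A z_t x_t'$
  $\quad - x_t x_{t-1}' B' + B x_{t-1} x_{t-1}' B' + A z_t x_{t-1}' B'$
  $\quad - x_t z_t' A' + B x_{t-1} z_t' A' + A z_t z_t' A'] \xrightarrow{p} \Sigma$,

(4.17)  $\frac{1}{n} \sum_{t=1}^n v_t x_{t-1}' = \frac{1}{n} \sum_{t=1}^n [x_t x_{t-1}' - B x_{t-1} x_{t-1}' - A z_t x_{t-1}']$,

(4.18)  $\frac{1}{n} \sum_{t=1}^n v_t z_t' = \frac{1}{n} \sum_{t=1}^n [x_t z_t' - B x_{t-1} z_t' - A z_t z_t']$.

If (4.17) $\xrightarrow{p} 0$, then from (4.16), (4.17), and (4.18) we obtain

(4.19)  $\frac{1}{n} \sum_{t=1}^n (x_t x_t' - B x_{t-1} x_{t-1}' B') = \frac{1}{n} \left[\sum_{t=1}^n (x_t x_t' - B x_t x_t' B') + B x_n x_n' B' - B x_0 x_0' B'\right] \xrightarrow{p} \Sigma + B L A' + A L' B' + A M_0 A'$.

If $(1/n) x_n' x_n \xrightarrow{p} 0$, then (4.8) follows from (4.19). Thus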

(4.20) 

Now  we  consider 


;E(xrH,w?  m, 


<=i 


(4-21 )  lajJ.t) 

n  z — ' 

(=i 

^  1 
n 

If the sums in (4.21) on $r, s$ run from $k + 1$ to $\infty$, the trace converges to an arbitrarily small quantity by taking $k$ sufficiently large. Then


E 

<=i 


r,s  =  0  \Vt-l-s/  \  1  / 


<4'22>  ip  E 

t=l  L  r,s=0  \  1  4  /  \  / 

* 

^  Br[^Ms_r^' 4  6r..,B](B')s. 


r,s  =  0 
Thus


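$$
\frac{1}{n} \sum_{t=1}^{n} x_{t-1} x_{t-1}' \stackrel{p}{\to} Q
= \sum_{r,s=0}^{\infty} B^r A M_{s-r} A' (B')^s + \sum_{s=0}^{\infty} B^s \Sigma (B')^s.
\tag{4.23}
$$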


By  similar  means  we  can  complete  the  proof  of 


(4.24) 


Theorem 1 can then be applied with $z_t$ in Theorem 1 replaced by $(x_{t-1}', z_t')'$ to obtain
(4.11), and (2.33) follows.

To apply Theorem 1 we also need

$$
\frac{1}{n} \max_{t=1,\dots,n} \|x_{t-1}\|^2 \stackrel{p}{\to} 0.
\tag{4.25}
$$


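To prove this we need only consider

$$
x^*_{t-1} = \sum_{s=0}^{t-2} B^s (v_{t-1-s} + A z_{t-1-s}).
\tag{4.26}
$$

Then

$$
\|x^*_{t-1}\|^2
= \Bigl\| \sum_{s=0}^{t-2} B^s (v_{t-1-s} + A z_{t-1-s}) \Bigr\|^2
\le 2 \Bigl\| \sum_{s=0}^{t-2} B^s v_{t-1-s} \Bigr\|^2 + 2 \Bigl\| \sum_{s=0}^{t-2} B^s A z_{t-1-s} \Bigr\|^2.
\tag{4.27}
$$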

By (3.4) the first term on the right-hand side of (4.27) is less than or equal to


(4.28) 


t- 2  t-2 

4  Y,  <*Y  Ar+VP_1sP_1  fmaxJ|rt||2. 

r,3=0  r,s=0 


Since $\|A z_{t-1-s}\|^2 \le \text{const}\, \|z_{t-1-s}\|^2$, we obtain


$$
\|x^*_{t-1}\|^2 \le 4 \sum_{r,s=0}^{t-2} \lambda^{r+s} r^{p-1} s^{p-1}
\Bigl( q \max_{t=1,\dots,n} \|v_t\|^2 + q^* \max_{t=1,\dots,n} \|z_t\|^2 \Bigr),
\tag{4.29}
$$
which implies (4.25) and $\|x_n\|^2 / n \stackrel{p}{\to} 0$.


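Now we want to show that

$$
\frac{1}{n} \sum_{t=1}^{n} x_{t-1} v_t' \stackrel{p}{\to} 0.
\tag{4.30}
$$

From (4.12) we have

$$
\frac{1}{n} \sum_{t=1}^{n} x_{t-1} v_t'
= \frac{1}{n} \sum_{t=1}^{n} \sum_{s=0}^{t-2} B^s v_{t-1-s} v_t'
+ \frac{1}{n} \sum_{t=1}^{n} B^{t-1} x_0 v_t'
+ \frac{1}{n} \sum_{t=1}^{n} \sum_{s=0}^{t-2} B^s A z_{t-1-s} v_t'.
\tag{4.31}
$$

It was shown in Section 3 that the first two terms on the right-hand side of (4.31) converge
to 0 in probability as $n \to \infty$.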

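Define $v_{nt}$ by (3.10) and $z_{nt}$ by

$$
z_{nt} = z_t I(\|z_t\|^2 < n).
\tag{4.32}
$$

Then

$$
\Pr\Bigl\{ \frac{1}{n} \sum_{t=1}^{n} \sum_{s=0}^{t-2} B^s A z_{t-1-s} v_t'
= \frac{1}{n} \sum_{t=1}^{n} \sum_{s=0}^{t-2} B^s A z_{n,t-1-s} v_{nt}' \Bigr\} \to 1.
\tag{4.33}
$$

Consider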


(4.34)  £  tr  (  i  E  BsAzn,t-s-!v'nt  J  (  ^  E  E  Br  Aznj-r-yv'nt  j 

V  (=1  3  =  0  /  \  1=1  3  =  0  / 

=  ^C  A*”-’—'- ■)  i^-l) 

1  n  1-2 

=  ^I£E  II  E  B‘Azn.M\\2S 

(=1  3  =  0 

1  n_1 

<S  max  ||2ns||2V  tr  A' (B')s  B9  A£(v'ntvnt\Ft-i) 

n  3=1 . n  ' 


because  ||zns||2/n  0  and  |j 2:^3 1|2  is  bounded  and  2Jt  X1  and  ||t?n(||2  is  bounded. 

This proves (4.20) and the theorem. ∎




Lemma 6. If the assumptions of Theorem 7 hold and if $\Sigma$ and $M_0$ are positive definite,
then (4.24) is positive definite.


Proof. 


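$$
\begin{aligned}
(c', d') \left( \begin{array}{cc} Q & L \\ L' & M_0 \end{array} \right) \left( \begin{array}{c} c \\ d \end{array} \right)
&= \operatorname*{plim}_{n \to \infty} \frac{1}{n} \sum_{t=1}^{n} (c' x_{t-1} + d' z_t)^2 \\
&= \operatorname*{plim}_{n \to \infty} \frac{1}{n} \sum_{t=1}^{n} \bigl[ (c' v_{t-1})^2 + (c' B x_{t-2} + c' A z_{t-1} + d' z_t)^2 \\
&\qquad\qquad + 2 c' v_{t-1} (x_{t-2}' B' c + z_{t-1}' A' c + z_t' d) \bigr] \\
&= c' \Sigma c + \operatorname*{plim}_{n \to \infty} \frac{1}{n} \sum_{t=1}^{n} (c' B x_{t-2} + c' A z_{t-1} + d' z_t)^2 \\
&\ge c' \Sigma c
\end{aligned}
\tag{4.35}
$$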


by (4.3) and (4.30). If the left-hand side of (4.35) is 0, then $c = 0$ because $\Sigma$ is positive
definite. In that case the left-hand side of (4.35) is $d' M_0 d = 0$; since $M_0$ is positive
definite, $d = 0$. ∎

A special case of the mixed model is $z_t = 1$. Then (4.1) is

$$
x_t = B x_{t-1} + \gamma + v_t,
\tag{4.36}
$$

where $\gamma = A$, or

$$
x_t - \mu = B (x_{t-1} - \mu) + v_t,
\tag{4.37}
$$


where $\gamma = (I - B)\mu$. In this case (2.41), (4.4) and (4.5) are automatically satisfied, and
condition (4.10) reduces to

$$
\frac{1}{n} \sum_{t=1}^{n} (\Sigma_t \otimes v_{t-1-s}) \stackrel{p}{\to} 0, \qquad s = 0, 1, \dots.
\tag{4.38}
$$

The matrix $L$ is

$$
L = \sum_{s=0}^{\infty} B^s \gamma = (I - B)^{-1} \gamma,
\tag{4.39}
$$

and the matrix $Q$ is

$$
Q = \Gamma + (I - B)^{-1} \gamma \gamma' (I - B')^{-1}.
\tag{4.40}
$$
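The closed forms in (4.39) and (4.40) can be checked numerically. The following sketch assumes a particular stable $B$, a vector $\gamma$, and a constant covariance $\Sigma$ (all three are arbitrary illustrative values, not from the paper), takes $\Gamma = \sum_{s \ge 0} B^s \Sigma (B')^s$ as in the abstract, and compares truncated series with the closed-form expressions.

```python
import numpy as np

# Illustrative values (assumptions for this sketch only).
B = np.array([[0.5, 0.2], [0.1, 0.3]])      # spectral radius below 1
gamma = np.array([1.0, -2.0])
Sigma = np.array([[1.0, 0.3], [0.3, 2.0]])

S = 80                                      # truncation point of the series
powers = [np.linalg.matrix_power(B, s) for s in range(S)]

# (4.39): L = sum_s B^s gamma = (I - B)^{-1} gamma
L_series = sum(Bs @ gamma for Bs in powers)
L_closed = np.linalg.solve(np.eye(2) - B, gamma)

# Gamma = sum_s B^s Sigma (B')^s
Gamma = sum(Bs @ Sigma @ Bs.T for Bs in powers)

# (4.40): Q = Gamma + (I - B)^{-1} gamma gamma' (I - B')^{-1} = Gamma + L L'
Q_closed = Gamma + np.outer(L_closed, L_closed)
Q_series = Gamma + sum(
    Br @ np.outer(gamma, gamma) @ Bs.T for Br in powers for Bs in powers
)

print(np.allclose(L_series, L_closed), np.allclose(Q_series, Q_closed))
```

The double sum used for `Q_series` is simply the product of two geometric (Neumann) series, which is why it collapses to $L L'$.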

In  this  case 

(4.41) 

xt~i 

1=1  ) 

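and $\mu_n = (I - B_n)^{-1} \gamma_n$, which is approximately $(1/n) \sum_{t=1}^{n} x_t$. The limiting covariance
matrix of $\sqrt{n} \bigl[ (1/n) \sum_{t=1}^{n} x_t - \mu \bigr]$ is

$$
(I - B)^{-1} \Gamma + \Gamma (I - B')^{-1} - \Gamma.
\tag{4.42}
$$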
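As a sanity check on (4.42), one can verify numerically that the expression equals $(I - B)^{-1} \Sigma (I - B')^{-1}$, which follows algebraically from $\Gamma = B \Gamma B' + \Sigma$. The sketch below is only an illustration under assumed values of $B$ and a constant $\Sigma$; the variable names are hypothetical.

```python
import numpy as np

# Illustrative values (assumptions for this sketch only).
B = np.array([[0.5, 0.2], [0.1, 0.3]])
Sigma = np.array([[1.0, 0.3], [0.3, 2.0]])
I = np.eye(2)

# Gamma = sum_{s>=0} B^s Sigma (B')^s, accumulated term by term.
Gamma = np.zeros((2, 2))
term = Sigma.copy()
for _ in range(200):
    Gamma += term
    term = B @ term @ B.T

inv = np.linalg.inv(I - B)                  # (I - B)^{-1}; inv.T is (I - B')^{-1}
expr_442 = inv @ Gamma + Gamma @ inv.T - Gamma
long_run = inv @ Sigma @ inv.T              # (I - B)^{-1} Sigma (I - B')^{-1}

print(np.allclose(expr_442, long_run))
```

Multiplying (4.42) on the left by $I - B$ and on the right by $I - B'$ leaves $\Gamma - B \Gamma B' = \Sigma$, which is the identity the code confirms.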

The condition (4.5) suggests a kind of lack of correlation between $z_{t+h}$ and $v_t$, which
is plausible if $\{z_t\}$ and $\{v_t\}$ are independent; that is, if the $z_t$'s are exogenous.
Appendix


Lemma 7. Let the largest absolute value of the characteristic roots of $B$ of order $p$
be $\lambda < 1$. Then for any vectors $u$ and $v$

$$
|u' (B')^r B^s v| \le \lambda^{r+s} r^{p-1} s^{p-1} q (\|u\|^2 + \|v\|^2)
\tag{A.1}
$$

for a suitable constant $q$.

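Proof. There exists a matrix $P$ such that $B = P^{-1} H P$, where

$$
H = \left( \begin{array}{cccc}
H_1 & 0 & \cdots & 0 \\
0 & H_2 & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & H_K
\end{array} \right),
\tag{A.2}
$$

the $p_k \times p_k$ matrix $H_k = \lambda_k I + L_k$, $\lambda_k$ is a characteristic root of $B$, and

$$
L_k = \left( \begin{array}{ccccc}
0 & 0 & \cdots & 0 & 0 \\
1 & 0 & \cdots & 0 & 0 \\
0 & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & & \vdots & \vdots \\
0 & 0 & \cdots & 1 & 0
\end{array} \right).
\tag{A.3}
$$

Then

$$
u' (B')^r B^s v = u' P' (H')^r (P P')^{-1} H^s P v.
\tag{A.4}
$$

Let

$$
(P P')^{-1} = G = \left( \begin{array}{cccc}
G_{11} & G_{12} & \cdots & G_{1K} \\
G_{21} & G_{22} & \cdots & G_{2K} \\
\vdots & \vdots & & \vdots \\
G_{K1} & G_{K2} & \cdots & G_{KK}
\end{array} \right).
\tag{A.5}
$$

Since $L_k^{p_k} = 0$, we have

$$
H_k^r = \lambda_k^r I + \lambda_k^{r-1} \binom{r}{1} L_k + \cdots + \lambda_k^{r-(p_k-1)} \binom{r}{p_k - 1} L_k^{p_k-1},
\tag{A.6}
$$

$$
\begin{aligned}
(H_k')^r G_{k\ell} H_\ell^s
&= \lambda_k^r \lambda_\ell^s \Bigl[ G_{k\ell} + \lambda_k^{-1} \binom{r}{1} L_k' G_{k\ell} + \lambda_\ell^{-1} \binom{s}{1} G_{k\ell} L_\ell + \cdots \\
&\qquad + \lambda_k^{-(p_k-1)} \lambda_\ell^{-(p_\ell-1)} \binom{r}{p_k-1} \binom{s}{p_\ell-1} (L_k')^{p_k-1} G_{k\ell} L_\ell^{p_\ell-1} \Bigr] \\
&= \lambda_k^r \lambda_\ell^s Q_{k\ell}(r,s).
\end{aligned}
\tag{A.7}
$$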

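Let $Pu = x$, $Pv = y$ and

$$
Q(r,s) = \left( \begin{array}{cccc}
Q_{11}(r,s) & Q_{12}(r,s) & \cdots & Q_{1K}(r,s) \\
Q_{21}(r,s) & Q_{22}(r,s) & \cdots & Q_{2K}(r,s) \\
\vdots & \vdots & & \vdots \\
Q_{K1}(r,s) & Q_{K2}(r,s) & \cdots & Q_{KK}(r,s)
\end{array} \right).
\tag{A.8}
$$

The element $q_{ij}(r,s)$ is a polynomial in $r$ and $s$ of degree at most $p - 1$ with fixed coefficients.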
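Then

$$
\begin{aligned}
|x' (H')^r G H^s y|
&\le \lambda^{r+s} \sum_{i,j=1}^{p} |x_i|\, |q_{ij}(r,s)|\, |y_j| \\
&\le \lambda^{r+s} \sum_{i,j=1}^{p} |q_{ij}(r,s)| (x_i^2 + y_j^2) \\
&\le p \lambda^{r+s} \max_{i,j} |q_{ij}(r,s)| (\|x\|^2 + \|y\|^2).
\end{aligned}
\tag{A.9}
$$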

Let

$$
q_{ij}(r,s) = \sum_{g,h=0}^{p-1} q_{ij}^{gh} r^g s^h.
\tag{A.10}
$$

Then

$$
\max_{i,j} |q_{ij}(r,s)| \le \max_{i,j} \sum_{g,h=0}^{p-1} |q_{ij}^{gh}|\, r^{p-1} s^{p-1}
\tag{A.11}
$$

and $\|x\|^2 \le \|u\|^2$ times the maximum characteristic root of $PP'$ and similarly for $\|y\|^2$.


The lemma follows. ∎
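The key computational fact behind the proof is that powers of a Jordan block $H_k = \lambda_k I + L_k$ have a finite binomial expansion, so their entries grow at most like $\lambda_k^r r^{p_k - 1}$. The sketch below checks this for a single block; the helper names `jordan_block` and `power_by_binomial`, the root 0.7, and the block size 3 are assumptions made only for this illustration.

```python
import numpy as np
from math import comb


def jordan_block(lam, size):
    """Return H = lam*I + L and L, with ones on the first subdiagonal of L."""
    L = np.eye(size, k=-1)
    return lam * np.eye(size) + L, L


def power_by_binomial(lam, L, r):
    """H^r via the finite binomial expansion, using L^size = 0."""
    size = L.shape[0]
    out = np.zeros((size, size))
    for j in range(min(r, size - 1) + 1):
        out += comb(r, j) * lam ** (r - j) * np.linalg.matrix_power(L, j)
    return out


lam, size = 0.7, 3                     # hypothetical root and block size
H, L = jordan_block(lam, size)

# The expansion agrees with direct matrix powers.
expansion_ok = all(
    np.allclose(power_by_binomial(lam, L, r), np.linalg.matrix_power(H, r))
    for r in range(12)
)

# Largest entry of H^r relative to lam^r * r^(size-1), for r >= 1.
ratios = [
    np.abs(np.linalg.matrix_power(H, r)).max() / (lam ** r * r ** (size - 1))
    for r in range(1, 60)
]
print(expansion_ok, max(ratios))
```

For this block the ratios stay below $\lambda^{-(p_k - 1)}$, because each binomial coefficient $\binom{r}{j}$ with $j \le p_k - 1$ is at most $r^{p_k - 1}$ when $r \ge 1$.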
Lemma 8. (3.28).


Proof.  The  left-hand  side  of  (3.28)  is  positive  semidefinite.  Its  trace  is 

(A. 12)  -£tr  Ettr  <  -  jSr  Et\2t-2t2p~2  q*\\xQ\\2 

n 

<=i  t= l 

We can take $t_0$ large enough so that for $t > t_0$ and arbitrary $\varepsilon > 0$, $\delta > 0$

$$
\Pr\{ \lambda^{2t-2} t^{2p-2} q^* \|x_0\|^2 < \varepsilon \} > 1 - \delta.
\tag{A.13}
$$

Then the right-hand side of (A.12) is with probability greater than $1 - \delta$ not greater than
(A. 14)  -  V  tr  Et\21~2t2p~2q*\\x0\\2  +  e-  V  tr  T,  ^  t  tr  T 


as $n \to \infty$.

Comments on Condition (2.5)

A key assumption is

$$
\sup_{t=1,2,\dots} \mathcal{E}\bigl[ v_t' v_t I(v_t' v_t > a) \mid \mathcal{F}_{t-1} \bigr] \stackrel{p}{\to} 0
\tag{A.15}
$$

as $a \to \infty$; that is, given $\varepsilon > 0$, $\delta > 0$ there exists $a_0$ such that for $a > a_0$

$$
\Pr\Bigl\{ \sup_{t=1,2,\dots} \mathcal{E}\bigl[ v_t' v_t I(v_t' v_t > a) \mid \mathcal{F}_{t-1} \bigr] < \varepsilon \Bigr\} > 1 - \delta.
\tag{A.16}
$$

Let $W_t(a) = \mathcal{E}[v_t' v_t I(v_t' v_t > a) \mid \mathcal{F}_{t-1}]$. The above event for fixed $a$ is

$$
\bigcap_{t=1}^{\infty} \{ W_t(a) < \varepsilon \},
\tag{A.17}
$$

which is measurable. The random variable

$$
X_n(a) = \max_{t=1,\dots,n} W_t(a)
\tag{A.18}
$$

has the property

$$
X_{n+1}(a) = \max[ X_n(a), W_{n+1}(a) ].
\tag{A.19}
$$

Note that for given $a$, $X_n(a)$ is nondecreasing in $n$. The event (A.17) is

$$
\Bigl\{ \lim_{n \to \infty} X_n(a) < \varepsilon \Bigr\} = \bigcap_{n=1}^{\infty} \{ X_n(a) < \varepsilon \}.
\tag{A.20}
$$

Note that since $X_n(a)$ can be defined by (A.19), it is a one-dimensional variable; that is,
the condition is a weak condition, not a strong condition. It is a condition on the cdfs of
$X_n(a)$.
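The recursion (A.19) is simply a running maximum, which is easy to see on a finite sample path. In the sketch below the array `W` holds made-up values standing in for $W_1(a), \dots, W_N(a)$ at a fixed $a$ (they are not computed from any model), and the threshold `eps` is likewise an arbitrary assumption.

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.exponential(size=20)           # hypothetical values of W_t(a), fixed a

# Running maximum via the recursion X_{n+1} = max[X_n, W_{n+1}].
X = np.empty_like(W)
X[0] = W[0]
for n in range(1, len(W)):
    X[n] = max(X[n - 1], W[n])

eps = 2.0                              # arbitrary threshold for illustration
event_via_W = bool(np.all(W < eps))    # finite-horizon analogue of (A.17)
event_via_X = bool(X[-1] < eps)        # same event expressed through X_N(a)
print(event_via_W, event_via_X)
```

Over a finite horizon the two descriptions of the event coincide exactly, and `X` is nondecreasing by construction; the statement in the text concerns the limit of this monotone sequence.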




References 

Anderson, T. W. (1971). The Statistical Analysis of Time Series. John Wiley and Sons,
Inc., New York.

Anderson, T. W. (1959). On the asymptotic distribution of estimates of parameters of
stochastic difference equations. Annals of Mathematical Statistics, 30, 676-687.

Chan, N. H., and C. Z. Wei (1987). Asymptotic inference in nearly non-stationary AR(1)
processes. Annals of Statistics, 15, 1050-1063.

Chow, Y. S., and Henry Teicher (1988). Probability Theory: Independence, Interchangeability,
Martingales (Second Edition). Springer-Verlag, New York.

Dvoretzky, Aryeh (1972). Asymptotic normality for sums of dependent random variables.
Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability,
Volume 2. University of California Press, Berkeley and Los Angeles, 513-535.

Hall, P., and C. C. Heyde (1980). Martingale Limit Theory and Its Applications. Academic
Press, New York.

Lai, Tse-Leung, and Herbert Robbins (1981). Consistency and asymptotic efficiency
of slope estimates in stochastic approximation schemes. Zeitschrift für
Wahrscheinlichkeitstheorie, 56, 329-360.

Lai, Tse-Leung, and David Siegmund (1983). Fixed accuracy estimation of an autoregressive
parameter. Annals of Statistics, 11, 478-485.

Lai, Tse-Leung, and C. Z. Wei (1983). Least squares estimates in stochastic regression
models with applications to identification and control of dynamic systems. Annals of
Statistics, 10, 154-166.

Lindeberg, J. W. (1922). Eine neue Herleitung des Exponentialgesetzes in der
Wahrscheinlichkeitsrechnung. Mathematische Zeitschrift, 15, 211-225.

Mann, H. B., and A. Wald (1943). On the statistical treatment of linear stochastic difference
equations. Econometrica, 11, 173-220.




Technical  Reports 
U.S.  Army  Research  Office 

Contracts  DAAG29-82-K-0156,  DAAG29-85-K-0239,  and  DAAL03-89-K-0033 

1. “Maximum Likelihood Estimators and Likelihood Ratio Criteria for Multivariate Elliptically
Contoured Distributions,” T. W. Anderson and Kai-Tai Fang, September 1982.

2. “A Review and Some Extensions of Takemura’s Generalizations of Cochran’s Theorem,” George
P.H. Styan, September 1982.

3. “Some Further Applications of Finite Difference Operators,” Kai-Tai Fang, September 1982.

4. “Rank Additivity and Matrix Polynomials,” George P.H. Styan and Akimichi Takemura,
September 1982.

5. “The Problem of Selecting a Given Number of Representative Points in a Normal Population
and a Generalized Mills' Ratio,” Kai-Tai Fang and Shu-Dong He, October 1982.

6. “Tensor Analysis of ANOVA Decomposition,” Akimichi Takemura, November 1982.

7. “A Statistical Approach to Zonal Polynomials,” Akimichi Takemura, January 1983.

8. “Orthogonal Expansion of Quantile Function and Components of the Shapiro-Francia
Statistic,” Akimichi Takemura, January 1983.

9. “An Orthogonally Invariant Minimax Estimator of the Covariance Matrix of a Multivariate
Normal Population,” Akimichi Takemura, April 1983.

10. “Relationships Among Classes of Spherical Matrix Distributions,” Kai-Tai Fang and Han-Feng
Chen, April 1984.

11. “A Generalization of Autocorrelation and Partial Autocorrelation Functions Useful for
Identification of ARMA(p,q) Processes,” Akimichi Takemura, May 1984.

12. “Methods and Applications of Time Series Analysis Part II: Linear Stochastic Models,” T. W.
Anderson and N. D. Singpurwalla, October 1984.

13. “Why Do Noninvertible Estimated Moving Averages Occur?” T. W. Anderson and Akimichi
Takemura, November 1984.

14. “Invariant Tests and Likelihood Ratio Tests for Multivariate Elliptically Contoured
Distributions,” Huang Hsu, May 1985.

15. “Statistical Inferences in Cross-lagged Panel Studies,” Lawrence S. Mayer, November 1985.

16. “Notes on the Extended Class of Stein Estimators,” Suk-ki Hahn, July 1986.

17. “The Stationary Autoregressive Model,” T. W. Anderson, July 1986.

18. “Bayesian Analyses of Nonhomogeneous Autoregressive Processes,” T. W. Anderson, Nozer
D. Singpurwalla, and Refik Soyer, September 1986.

19. “Estimation of a Multivariate Continuous Variable Panel Model,” Lawrence S. Mayer and Kun
Mao Chen, June 1987.

20. “Consistency of Invariant Tests for the Multivariate Analysis of Variance,” T. W. Anderson
and Michael D. Perlman, October 1987.

21. “Likelihood Inference for Linear Regression Models,” T. J. DiCiccio, November 1987.

22. “Second-order Moments of a Stationary Markov Chain and Some Applications,” T. W.
Anderson, February 1989.

23. “Asymptotic Robustness in Regression and Autoregression Based on Lindeberg Conditions,”
T. W. Anderson and Naoto Kunitomo, June 1989.


